Skip to content

Stop lazy alias resolver from overwriting real classes (#1662)#1689

Open
st0012 wants to merge 4 commits intomasterfrom
claude/investigate-rdoc-1662-I9ARb
Open

Stop lazy alias resolver from overwriting real classes (#1662)#1689
st0012 wants to merge 4 commits intomasterfrom
claude/investigate-rdoc-1662-I9ARb

Conversation

@st0012
Copy link
Copy Markdown
Member

@st0012 st0012 commented Apr 30, 2026

Fixes #1662.

Problem

The lazy alias resolver added in #1621 (Constant#is_alias_for's fall-through to find_alias_for) bypassed two safeguards that the explicit alias path has:

  • add_module_alias's collision guard against clobbering existing classes
  • The parser's document_self (:nodoc:) check

Net result: Foo = Bar # :nodoc: parsed alongside a real class Foo would silently overwrite the real class with an alias copy of Bar — which is how prism's Ripper = Prism::Translation::Ripper shim wiped Ripper's docs.

Fix

Split the API. Constant#is_alias_for is back to a pure accessor; the lazy lookup moves to a separate Constant#resolved_alias_target.

Make the lazy path opt-in. A new Constant#is_alias_for_path attribute is set by both parsers (prism via ConstantReadNode/ConstantPathNode, ripper via :on_const token info) using information they already had, so the regex-on-#value guess goes away.

Tighten update_aliases. It now:

  • Falls back to resolved_alias_target only when there's no explicit alias.
  • Persists the lazy result back onto the constant so Stats#report_constants and Constant#marshal_dump observe the relationship.
  • Gates the destination write behind a collision check that mirrors the BasicObject = BlankSlate guard in add_module_alias. The guard applies uniformly to both paths.

Compose the guard with pre-registration. For the unconditional guard to coexist with add_module_alias's practice of pre-registering an alias copy in classes_hash, add_module_alias now sets is_alias_for on that copy at creation time — so existing alias placeholders are distinguishable from real classes.

Tests

Each scenario asserts both that the real class wasn't clobbered and that the alias target keeps its own methods:

  • :nodoc: assignment
  • Real-class collision
  • The combined Ripper is overtaken by Prism::Translation::Ripper #1662 scenario (:nodoc: + collision)
  • :stopdoc: / :startdoc: workaround path
  • Explicit alias followed by a same-named class re-open
  • Forward-ref alias whose RHS carries a trailing :nodoc: comment

Both prism and (RDOC_USE_RIPPER_PARSER=1) ripper variants pass.

Docs

Companion commit refactors AGENTS.md's Architecture Notes to document the parser duality, the two-phase alias build, the lexical-scope limitation, and MARSHAL_VERSION discipline.

Followups (out of scope)

  • Collapse is_alias_for and is_alias_for_path into one field — needs MARSHAL_VERSION bump.
  • Capture lexical scope at parse time so resolved_alias_target doesn't lean on the syntactic parent chain (already known-approximate per Context#find_enclosing_module_named docstring).

@matzbot
Copy link
Copy Markdown
Collaborator

matzbot commented Apr 30, 2026

🚀 Preview deployment available at: https://d1e27c16.rdoc-6cd.pages.dev (commit: 485c8ed)

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a regression where constant-alias handling could overwrite (“clobber”) documentation for a real class/module with an alias copy, especially when aliases are resolved lazily and/or :nodoc: directives are involved.

Changes:

  • Splits constant alias handling into an explicit accessor (Constant#is_alias_for) and a non-mutating lazy lookup (Constant#resolved_alias_target) based on a new parse-time field is_alias_for_path.
  • Updates Prism and Ripper parsers to record RHS constant-paths (is_alias_for_path) rather than re-deriving alias-ness from the textual value.
  • Adjusts ClassModule#update_aliases to use resolved_alias_target as a fallback and adds tests covering :nodoc: and collision scenarios.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
test/rdoc/parser/prism_ruby_test.rb Updates existing alias tests to the new API and adds regression tests for :nodoc: and collision/clobber scenarios.
test/rdoc/code_object/class_module_test.rb Adds unit tests asserting update_aliases won’t overwrite real classes and will skip :nodoc: constants.
lib/rdoc/parser/ripper_ruby.rb Records constant.is_alias_for_path when creating module aliases.
lib/rdoc/parser/prism_ruby.rb Propagates parse-time constant-path detection into add_constant(..., alias_path:) and uses it for alias registration.
lib/rdoc/code_object/constant.rb Introduces is_alias_for_path and resolved_alias_target, and makes is_alias_for a pure explicit accessor.
lib/rdoc/code_object/class_module.rb Uses the new lazy resolution fallback and adds a collision guard for the lazy path in update_aliases.
AGENTS.md Documents the dual-parser architecture, alias resolution phases, and marshal/ri compatibility constraints.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/rdoc/code_object/class_module.rb Outdated
@st0012
Copy link
Copy Markdown
Member Author

st0012 commented May 1, 2026

@codex are you able to review this

claude added 2 commits May 1, 2026 08:31
Fixes #1662.

PR #1621 made RDoc::Constant#is_alias_for fall through to a lazy
find_alias_for lookup that returned whatever class the constant's value
named in the current store. The lazy result then flowed into
RDoc::ClassModule#update_aliases, which unconditionally wrote a dup'd
alias copy into @store.classes_hash, with two safeguards present
elsewhere bypassed:

* Context#add_module_alias refuses to clobber an existing class entry
  (the historic BasicObject = BlankSlate guard), but update_aliases
  did not.
* The prism parser only registers an alias when the constant has
  document_self set (so :nodoc: is honored), but the lazy resolver
  ignored it.

In combination these meant `Foo = Bar # :nodoc:`, where a real `Foo`
class was parsed elsewhere, would silently replace the real class's
documentation with an alias copy of `Bar` -- the literal failure mode
in #1662 (the prism shim's `Ripper = Prism::Translation::Ripper`
clobbering the real Ripper class docs).

Architectural fix
-----------------

Split RDoc::Constant#is_alias_for back into a pure accessor and a
separate Constant#resolved_alias_target lookup. is_alias_for now
returns only what was recorded explicitly (by Context#add_module_alias,
by ClassModule#update_aliases, or by ri marshal load) and never mutates
state. resolved_alias_target is the opportunistic forward-reference
lookup, used only by update_aliases as a fallback when no explicit
alias is recorded; it honors document_self so :nodoc: constants
don't lazy-resolve.

The lazy lookup itself uses a new Constant#is_alias_for_path attribute
populated by the parsers, instead of regex-matching #value at lookup
time. Both the prism parser (via ConstantReadNode/ConstantPathNode
detection in constant_path_string) and the legacy ripper parser (via
on_const-only token accumulation in parse_constant_body) already know
whether the RHS is a constant reference at parse time; we now propagate
that explicitly rather than rediscovering it from a stringly-typed
value.

Mechanical fix
--------------

ClassModule#update_aliases falls back to const.resolved_alias_target
when const.is_alias_for is unset, and gates the destination write
behind a collision check that mirrors the one in
Context#add_module_alias: skip if classes_hash/modules_hash already
holds a real (non-alias) class at the destination name. The guard now
applies uniformly to both the explicit and lazy paths.

When the fallback resolves a forward-reference target, the resolved
class is also written back to const.is_alias_for so downstream
consumers (RDoc::Stats#report_constants, RDoc::Constant#marshal_dump)
observe the alias relationship the lazy lookup found -- otherwise
forward-ref aliases get reported as undocumented and the alias slot is
serialized as nil into ri data.

For that guard to compose with add_module_alias's pre-registration
flow (which already places an alias copy at the destination expecting
update_aliases to overwrite it with the properly-marked version),
add_module_alias now sets is_alias_for on the alias copy at creation
time. Existing alias copies are therefore distinguishable from real
classes by the unconditional guard.

Tests
-----

Parser-level regressions cover :nodoc: assignment, real-class
collision, the combined :nodoc:-plus-collision scenario from #1662,
the :stopdoc:/:startdoc: workaround path, an explicit alias followed
by a same-named class re-open (which under the new guard preserves
both the alias target's methods and the methods added by the re-open),
and the post-Store#complete is_alias_for persistence on a
forward-reference alias. Each asserts both the negative (the would-be
alias didn't clobber) and the positive (the alias target keeps its
own methods and aliases list). Unit tests on update_aliases assert
the same. The two existing PR #1621 tests
(test_constant_alias_reverse_order, test_repeated_constant_alias) are
updated to exercise the renamed resolved_alias_target API; the
underlying forward-reference behavior is preserved.
Replace the bare two-line "Pluggable System" subsection of Architecture
Notes with three substantive subsections capturing invariants future
agents need:

* "Parsers and Generators" -- names the two Ruby parser classes
  (PrismRuby default, RipperRuby legacy), explains the
  RDocParserPrismTestCases mixin pattern, and notes that the ripper
  variant is gated by RDOC_USE_RIPPER_PARSER (so a local
  bundle exec rake only exercises prism by default; CI runs ripper in
  a separate job).
* "Code Object Model and Constant Aliases" -- describes the two-phase
  build (parse-time add_constant/add_module_alias vs.
  Store#complete -> ClassModule#update_aliases), the invariant that
  safeguards on one path must mirror on the other, and the known
  lexical-scope approximation in Context#find_enclosing_module_named.
* "Marshal / ri Data Compatibility" -- names the in-tree consumers
  (ri CLI in lib/rdoc/ri/driver.rb, ri --server servlet in
  lib/rdoc/ri/servlet.rb) and the per-class MARSHAL_VERSION discipline
  needed to keep locally-cached .ri data loadable across rdoc
  upgrades.
st0012 pushed a commit to st0012/rdoc that referenced this pull request May 1, 2026
In RDoc::Parser::RipperRuby#parse_constant_body, the on_const branch
only calls create_module_alias (and so only sets is_alias_for_path on
the constant) when the RHS const token is followed immediately by
on_nl or EOF. Anything else parsed as a separate token after the const
-- a trailing +# comment+, a blank line before another statement -- exits
the loop via a different branch with is_alias_for_path still nil.

Pre-ruby#1689 that didn't matter: Constant#is_alias_for fell back to a regex
match on +value+ at lookup time. This PR removed that fallback (split
into resolved_alias_target, which only fires when is_alias_for_path was
set at parse time), so under RDOC_USE_RIPPER_PARSER=1 those aliases
silently fail to resolve during Store#complete -- Stats reports them as
undocumented and ri data drops the alias slot.

Backstop with a regex check on the constant body text after
parse_constant_body returns: if the body is purely a constant path,
record it as is_alias_for_path. This mirrors the pre-PR
Constant#find_alias_for regex exactly, so behavior for ripper-only
inputs that worked before continues to work. Trailing-semicolon forms
(+Foo = Bar;+) remain unresolved because the body text includes the +;+
and so doesn't match -- that's a pre-existing limitation, not a fresh
regression. Prism is unaffected: its is_alias_for_path is set from the
AST node type, which doesn't depend on adjacent whitespace/comment
tokens.

The new test_constant_alias_with_trailing_comment_resolves_after_complete
runs against both parsers via the RDocParserPrismTestCases mixin and
fails on the ripper variant before this change.
Codex flagged a ripper-only path where Constant#is_alias_for_path stays
nil for +Foo = Bar # comment+: parse_constant_body's on_const branch
only fires create_module_alias when the const token is followed by
on_nl/EOF, so the trailing comment short-circuits the recording. Pre-PR
this was masked by Constant#find_alias_for's lookup-time regex on
+value+, which this PR removes.

Skip the actual fix -- the ripper parser is on its way out (loads with
a deprecation banner, opt-in via RDOC_USE_RIPPER_PARSER=1) and isn't
worth a backstop. Keep the test in the shared parser-mixin
(RDocParserPrismTestCases) so prism is still guarded against the same
shape, and omit it under the legacy ripper variant.
@st0012 st0012 force-pushed the claude/investigate-rdoc-1662-I9ARb branch from 00e904a to 1846a8a Compare May 2, 2026 01:15
@st0012 st0012 added the bug label May 2, 2026
@st0012 st0012 marked this pull request as ready for review May 2, 2026 01:46
Copilot AI review requested due to automatic review settings May 2, 2026 01:46
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/rdoc/code_object/class_module.rb Outdated
Comment thread lib/rdoc/parser/prism_ruby.rb Outdated
* update_aliases persisted const.is_alias_for from the lazy
  resolved_alias_target lookup before the collision guard ran. When
  a real class already lived at the destination, the alias copy was
  correctly skipped but the constant was still mismarked, so
  Stats#report_constants and Constant#marshal_dump observed the
  alias relationship for an object the guard had just protected.
  Move the persistence below the guard so skipped aliases also skip
  the persistence.

* In the prism parser, add_constant resolved owner/name from the
  full constant path but called @container.add_module_alias on the
  outer lexical container. For Outer::Foo = Bar parsed at top level,
  this registered the alias copy as classes_hash['Foo'] (using the
  container's child_name) instead of 'Outer::Foo', leaving a stray
  top-level entry that update_aliases later duplicated under the
  correct namespace. Call owner.add_module_alias instead. The
  ripper parser already handles this correctly because parse_constant
  reassigns its container to the resolved nested module.

Tests:
* test_constant_alias_collision_does_not_mismark_constant_as_alias
  asserts Foo's constant ends up with is_alias_for=nil when a real
  Foo class already exists alongside Foo = Bar.
* test_qualified_constant_alias_registers_in_owner_namespace
  asserts Outer::Foo = Bar registers only at classes_hash['Outer::Foo']
  and does not leak a top-level Foo entry.
@st0012 st0012 force-pushed the claude/investigate-rdoc-1662-I9ARb branch from a0405da to 485c8ed Compare May 2, 2026 11:22
Copilot AI review requested due to automatic review settings May 2, 2026 11:22
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +119 to +122
def resolved_alias_target
return nil unless document_self
return nil unless @is_alias_for_path
parent.find_module_named(@is_alias_for_path)
end

def test_constant_alias_with_trailing_comment_resolves_after_complete
omit if accept_legacy_bug? # ripper parser is deprecated; gap not worth fixing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ripper is overtaken by Prism::Translation::Ripper

4 participants